In [ ]:
seaborn:
It is a Python library, built on top of matplotlib, that is used to create statistical visualizations.
It makes statistical representations of the data easier to understand.
It helps us see the patterns and relationships inside the data.
With the help of these visualizations we can make data-driven decisions.
In [ ]:
seaborn groups its plotting functions into the following categories:
categorical plots
distribution plots
regression plots
matrix plots
relational plots
multi plot grids
In [ ]:
categorical plots:
used to visualize the data across different categories

barplot()
boxplot()
violinplot()
countplot()
stripplot()
catplot() (named factorplot() before seaborn 0.9)
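
As a minimal sketch of the categorical functions listed above, the cell below draws a box plot with the axes-level boxplot() and a violin plot through the figure-level catplot(). The `demo` frame and its column names are made up for illustration only.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical demo data, invented for this sketch
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "Category": rng.choice(["Electronics", "Clothing", "Home"], 120),
    "Sales": rng.normal(500, 120, 120).round(2),
})

# Axes-level call: draws one box per category on a single Axes
ax = sns.boxplot(data=demo, x="Category", y="Sales")

# Figure-level call: catplot() creates its own figure,
# and `kind` selects which categorical plot is drawn
grid = sns.catplot(data=demo, x="Category", y="Sales", kind="violin")
```

Changing `kind` to "bar", "box", "strip", or "count" switches the plot type without changing the rest of the call (for "count", pass only one of x or y).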
In [ ]:
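distribution plots:
used to see how the data is distributed, either for one variable (univariate) or for two variables together (bivariate)

jointplot
distplot (deprecated since seaborn 0.11; use histplot or displot instead)
pairplot
rugplot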
In [ ]:
regression plots:
used to show the linear relationship between data columns by fitting a regression line

lmplot
regplot
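
A minimal sketch of a regression plot, assuming made-up data where Sales is constructed to rise with Discount. regplot() only draws the fitted line, so np.polyfit is used here as a separate step to read off the slope.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical demo data: Sales is built to increase with Discount, plus noise
rng = np.random.default_rng(2)
discount = rng.uniform(0, 0.3, 150)
sales = 400 + 500 * discount + rng.normal(0, 40, 150)
demo = pd.DataFrame({"Discount": discount, "Sales": sales})

# Scatter points plus a fitted straight line with a confidence band
ax = sns.regplot(data=demo, x="Discount", y="Sales")

# seaborn does not return the fitted coefficients, so compute them separately
slope, intercept = np.polyfit(demo["Discount"], demo["Sales"], 1)
print(f"slope={slope:.1f}, intercept={intercept:.1f}")
```

lmplot() accepts the same x and y arguments but is figure-level, so it can also split the data into facets with `col` or `hue`.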
In [ ]:
matrix plots:

used to show the data in matrix form, with each cell colored by its value

heatmap
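
One way to get data into matrix form is a pivot table. The sketch below assumes a made-up `demo` frame and shows the mean Sales for each Region and Category pair as a heatmap.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical demo data, invented for this sketch
rng = np.random.default_rng(3)
demo = pd.DataFrame({
    "Region": rng.choice(["North", "South", "East", "West"], 200),
    "Category": rng.choice(["Electronics", "Clothing", "Home"], 200),
    "Sales": rng.normal(500, 120, 200),
})

# Rows = regions, columns = categories, cells = mean sales
matrix = demo.pivot_table(index="Region", columns="Category",
                          values="Sales", aggfunc="mean")

# annot=True writes the value inside each cell
ax = sns.heatmap(matrix, annot=True, fmt=".0f", cmap="viridis")
```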
In [ ]:
relational plots:
used to show the relationship between two variables

relplot()
scatterplot()
lineplot()
In [ ]:
multi-plot grids:

used to visualize multiple instances of the same plot on different subsets of the data

FacetGrid
In [ ]:
barplot: displays an aggregated value for each category (the mean by default; other estimators such as min or max can be chosen)
lineplot: displays the relationship between two numeric columns as a continuous line
scatterplot: shows the relationship between two columns (input/output) as individual points
countplot: shows the count of observations in each category
histogram (histplot): shows the distribution of the data
kde (kdeplot): shows the kernel density estimate, a smooth curve that approximates the distribution of the data
boxplot: displays the median, the quartiles, and the outliers of the data
violinplot: a combination of a boxplot and a kde plot
heatmap: shows values in a color-coded matrix; often used for the correlation matrix of the data
pairplot: shows the pairwise relationships between all the numeric columns (scatterplot, kde, hist)
lmplot: a scatter plot with a regression line
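
The difference between barplot and countplot can be seen on a tiny hand-made frame (the values below are invented for illustration). barplot aggregates a numeric column per category, while countplot only counts the rows.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical demo data, small enough to check by hand
demo = pd.DataFrame({
    "Region": ["North", "North", "South", "South", "South"],
    "Profit": [10.0, 30.0, 20.0, 40.0, 60.0],
})

fig, axes = plt.subplots(1, 2, figsize=(8, 3))

# barplot: bar height is the mean Profit per Region (the default estimator)
sns.barplot(data=demo, x="Region", y="Profit", ax=axes[0])

# countplot: bar height is the number of rows per Region, no y column needed
sns.countplot(data=demo, x="Region", ax=axes[1])

# The same numbers computed directly with pandas
means = demo.groupby("Region")["Profit"].mean()
counts = demo["Region"].value_counts()
```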
In [ ]:
pip install seaborn
In [1]:
import seaborn as sns
In [3]:
import numpy as np
import pandas as pd

# Set random seed for reproducibility
np.random.seed(42)

# Create synthetic data
n = 300

data = pd.DataFrame({
    "Date": pd.date_range(start="2024-01-01", periods=n, freq="D"),
    "Region": np.random.choice(["North", "South", "East", "West"], n),
    "Category": np.random.choice(["Electronics", "Clothing", "Home", "Sports"], n),
    "Sales": np.random.normal(500, 120, n).round(2),
    "Profit": np.random.normal(80, 30, n).round(2),
    "Quantity": np.random.randint(1, 20, n),
    "Discount": np.random.uniform(0, 0.3, n).round(2),
    "Customer_Age": np.random.randint(18, 65, n)
})
In [5]:
data
Out[5]:
           Date  Region     Category   Sales  Profit  Quantity  Discount  Customer_Age
  0  2024-01-01    East  Electronics  505.47   78.02         4      0.05            25
  1  2024-01-02    West  Electronics  421.81   43.67        18      0.26            39
  2  2024-01-03   North         Home  757.27   60.44         5      0.07            61
  3  2024-01-04    East     Clothing  576.07   81.42        16      0.29            53
  4  2024-01-05    East     Clothing  256.98   54.19         1      0.10            22
...         ...     ...          ...     ...     ...       ...       ...           ...
295  2024-10-22    West       Sports  505.77   64.31        14      0.30            42
296  2024-10-23   South  Electronics  531.17   67.39        11      0.20            26
297  2024-10-24    East         Home  391.48   71.55        18      0.17            19
298  2024-10-25   North       Sports  576.63   39.67        12      0.22            32
299  2024-10-26    West     Clothing  300.62   52.44        12      0.14            20

300 rows × 8 columns

In [7]:
# Histogram of Sales with a KDE curve overlaid
sns.histplot(data=data, x="Sales", kde=True)
Out[7]:
<Axes: xlabel='Sales', ylabel='Count'>
In [11]:
# Box plot of Sales for each Category (median, quartiles, outliers)
sns.boxplot(x="Category", y="Sales", data=data)
Out[11]:
<Axes: xlabel='Category', ylabel='Sales'>
In [15]:
# Bar height is the mean Profit per Region (the default estimator)
sns.barplot(x="Region", y="Profit", data=data)
Out[15]:
<Axes: xlabel='Region', ylabel='Profit'>
In [17]:
# Scatter plot of Sales against Discount
sns.scatterplot(x="Discount", y="Sales", data=data)
Out[17]:
<Axes: xlabel='Discount', ylabel='Sales'>
In [23]:
# Number of rows in each Region
sns.countplot(x="Region", data=data)
Out[23]:
<Axes: xlabel='Region', ylabel='count'>
In [27]:
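# jointplot needs the two columns to compare; Sales and Profit are used here
sns.jointplot(data=data, x="Sales", y="Profit")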
Out[27]:
<seaborn.axisgrid.JointGrid at 0x1a47ad158b0>
In [29]:
# Pairwise scatter plots of the numeric columns, with histograms on the diagonal
sns.pairplot(data=data)
Out[29]:
<seaborn.axisgrid.PairGrid at 0x1a47ab6cec0>
In [ ]:
What is seaborn, and how is it different from matplotlib?
How do you install the seaborn library?
What is the difference between countplot() and barplot()?
How do you create a boxplot in seaborn?
What is the use of the hue parameter?
What is kdeplot()?
What is the use of jointplot()?
What is pairplot(), and why is it important?
What is the difference between distplot() and histplot()?
How do you create a heatmap?
What are the different plot categories available in the seaborn library?
How do you create a histplot?
How do you visualize the correlation matrix?
In [ ]:
scenario-based questions:
1. sales analysis
You need to create a sales dataset that contains an order date.

You have monthly sales for 5 years.

Which plot will you use to show the trends?

How will you compare sales across different regions?


2. HR dataset

We want to check the salary distribution by department.

Which plot will you use, and why?
How do you check for outliers in the dataset?
Why do outliers matter, and how do they affect ML models?


3. healthcare dataset

cancer dataset

How many people are affected by cancer?
gender (male/female): how does cancer affect each gender?
age (child/adult): how does cancer affect each age group?


4. banking dataset

You want to check the correlation between the features.

How will you visualize the correlation?
Which plot should be used, and why?
What is correlation?
How can you tell whether your data is strongly correlated or not?
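
For the correlation questions above, one common approach is to compute the correlation matrix with pandas and draw it as a heatmap. The sketch below uses made-up banking-style columns (Income, Loan_Amount, Age are hypothetical names), with Loan_Amount deliberately built from Income so that one strong correlation shows up.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical demo data, invented for this sketch
rng = np.random.default_rng(6)
income = rng.normal(50_000, 10_000, 200)
demo = pd.DataFrame({
    "Income": income,
    # Built from Income on purpose, so the two columns are strongly correlated
    "Loan_Amount": income * 0.4 + rng.normal(0, 2_000, 200),
    # Generated independently of the other two columns
    "Age": rng.integers(21, 65, 200),
})

# Pearson correlation by default; every value lies between -1 and 1
corr = demo.corr()

# Fixing vmin/vmax keeps the color scale centered on zero
ax = sns.heatmap(corr, annot=True, vmin=-1, vmax=1, cmap="coolwarm")
```

Values close to 1 or -1 indicate a strong linear relationship, and values close to 0 indicate a weak one.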
In [ ]:
dashboard: multiple plots shown together

streamlit: a library for building web applications

A dashboard can be built with the help of a library such as plotly.
In [ ]:
plotly: It is a library used to visualize the data in a more interactive way.
In [ ]:
pip install plotly
In [31]:
import plotly.express as px
In [33]:
d1 = px.bar(data, x="Category", y="Sales", color="Region")
In [35]:
 
In [ ]: